Query Aspect Based Term Weighting Regularization in Information Retrieval
Abstract
Traditional retrieval models assume that query terms are independent and rank documents primarily by term weighting strategies such as TF-IDF and document length normalization. However, query terms are related, and groups of semantically related query terms may form query aspects. Intuitively, the relations among query terms could be used to identify hidden query aspects and to promote documents that cover more of those aspects. Despite its importance, the use of semantic relations among query terms for term weighting regularization has been under-explored in information retrieval. In this paper, we study how to incorporate query term relations into existing retrieval models, focusing on one challenge: how to regularize the weights of terms in different query aspects to improve retrieval performance. Specifically, we first develop a general strategy that systematically integrates a term weighting regularization function into existing retrieval functions, and then propose two specific regularization functions guided by constraint analysis. Experiments on eight standard TREC data sets show that the proposed methods are effective in improving retrieval accuracy.
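The idea of promoting documents that cover more query aspects can be illustrated with a toy sketch. This is not either of the paper's two regularization functions, which the abstract does not spell out. It assumes a simple log-scaled TF-IDF score without length normalization, and a hypothetical rule that gives each aspect a total weight of 1, split evenly among its terms. The function names, the query, and the document frequencies are all made up for illustration.

```python
import math
from collections import Counter


def aspect_weights(aspects):
    """Hypothetical regularizer: each aspect gets total weight 1,
    split evenly among the terms that belong to it."""
    return {term: 1.0 / len(aspect) for aspect in aspects for term in aspect}


def score(query_terms, doc_tokens, doc_freq, num_docs, weights=None):
    """Log-scaled TF-IDF score with optional per-term weights.
    With weights=None every query term counts equally (the baseline)."""
    counts = Counter(doc_tokens)
    total = 0.0
    for term in query_terms:
        tf = counts[term]
        if tf == 0:
            continue  # a term absent from the document contributes nothing
        idf = math.log((num_docs + 1) / doc_freq[term])
        weight = 1.0 if weights is None else weights[term]
        total += weight * (1.0 + math.log(tf)) * idf
    return total


# Made-up query with two aspects: {"stolen", "forged"} and {"art"}.
query = ["stolen", "forged", "art"]
aspects = [["stolen", "forged"], ["art"]]
doc_freq = {"stolen": 10, "forged": 10, "art": 10}  # assumed statistics
num_docs = 99

one_aspect_doc = ["stolen", "forged", "stolen"]  # covers one aspect only
two_aspect_doc = ["stolen", "art"]               # covers both aspects

weights = aspect_weights(aspects)
for name, doc in [("one aspect", one_aspect_doc), ("two aspects", two_aspect_doc)]:
    baseline = score(query, doc, doc_freq, num_docs)
    regularized = score(query, doc, doc_freq, num_docs, weights)
    print(f"{name}: baseline={baseline:.3f} regularized={regularized:.3f}")
```

With these made-up numbers, the baseline ranks the document covering a single aspect first, because it matches two query terms and repeats one of them. Once the weight is shared within each aspect, the document covering both aspects ranks first.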
Similar resources
Web Information Retrieval using WordNet
Information retrieval (IR) is the area of study concerned with searching for documents or for information within documents. The user describes an information need with a query consisting of a number of words. Finding the weight of a query term is useful for determining the importance of a query. Calculating term importance is a fundamental aspect of most information retrieval approaches and it is traditionall...
Relation Based Term Weighting Regularization
Traditional retrieval models compute term weights based only on information related to individual terms, such as TF and IDF. However, query terms are related. Intuitively, these relations could provide useful information about the importance of a term in the context of other query terms. For example, the query “perl tutorial” specifies that a user is looking for information relevant to both perl and t...
Concept based Web Information Retrieval
Information retrieval is concerned with finding documents relevant to a user’s information needs in a collection of documents. The user describes an information need with a query consisting of a number of words. Finding the weight of a query is important for determining the importance of a query. Calculating term importance is a fundamental aspect of most information retrieval approaches and it is commonly dete...
Term Selection Term Selection Query-language Term Translation Doc-language Term Selection Term Weighting Term Matching Term Weighting Term Matching
This paper presents results for the Japanese/English cross-language information retrieval task on the NACSIS Test Collection. Two automatic dictionary-based query translation techniques were tried with four variants of the queries. The results indicate that longer queries outperform the required description-only queries and that use of the first translation in the edict dictionary is comparable w...
Improved Skips for Faster Postings List Intersection
Information retrieval can be achieved through computerized processes that generate a list of relevant responses to a query. The document processor, the matching function, and the query analyzer are the main components of an information retrieval system. Document retrieval systems are fundamentally based on Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...
Publication date: 2010